Towards Scalable Network Delay Minimization
Reduction of end-to-end network delays is an optimization task with
applications in multiple domains. Low delays enable improved information flow
in social networks, quick spread of ideas in collaboration networks, low travel
times for vehicles on road networks, and higher packet rates in communication
networks. Delay reduction can be achieved both by improving the propagation
capabilities of individual nodes and by adding edges to the network. One of
the main challenges in such design problems is that the
effects of local changes are not independent, and as a consequence, there is a
combinatorial search-space of possible improvements. Thus, minimizing the
cumulative propagation delay requires novel scalable and data-driven
approaches.
In this paper, we consider the problem of network delay minimization via node
upgrades. Although the problem is NP-hard, we show that a probabilistic
approximation can be obtained for a restricted version. We design scalable,
high-quality techniques for the general setting that are based on sampling and
targeted to different models of delay distribution. Our methods scale almost
linearly with the graph size and consistently outperform competitors in quality.
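
To make the node-upgrade setting concrete, here is a minimal sketch of one possible sampling-based greedy heuristic. It is not the paper's algorithm. It assumes that a path's delay is the sum of the delays of the nodes it visits, that upgrading a node replaces its delay with a given lower value, and that the total delay is estimated from a random sample of source-target pairs. The names `path_delay`, `sampled_delay`, and `greedy_upgrades` are hypothetical.

```python
import heapq
import random


def path_delay(adj, delay, src, dst):
    """Dijkstra where a path costs the sum of the delays of its nodes."""
    best = {src: delay[src]}
    heap = [(delay[src], src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v in adj[u]:
            nd = d + delay[v]
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")  # dst is unreachable from src


def sampled_delay(adj, delay, pairs):
    """Estimate the cumulative delay from a sample of (source, target) pairs."""
    return sum(path_delay(adj, delay, s, t) for s, t in pairs)


def greedy_upgrades(adj, delay, upgraded_delay, budget, num_samples, seed=0):
    """Pick up to `budget` nodes, each time taking the upgrade that lowers
    the sampled delay the most (assumed heuristic, for illustration only)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    pairs = [tuple(rng.sample(nodes, 2)) for _ in range(num_samples)]
    delay = dict(delay)  # work on a copy
    chosen = []
    for _ in range(budget):
        best_node, best_total = None, sampled_delay(adj, delay, pairs)
        for v in nodes:
            if v in chosen:
                continue
            trial = dict(delay)
            trial[v] = upgraded_delay[v]
            total = sampled_delay(adj, trial, pairs)
            if total < best_total:
                best_node, best_total = v, total
        if best_node is None:
            break  # no remaining upgrade helps on the sample
        delay[best_node] = upgraded_delay[best_node]
        chosen.append(best_node)
    return chosen, delay
```

Each greedy round re-evaluates every candidate because the benefit of upgrading one node depends on which other nodes have already been upgraded, which is the non-independence the abstract points to. This naive version is far from linear in the graph size; it only illustrates the objective.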
Indexing the Earth Mover's Distance Using Normal Distributions
Querying uncertain data sets (represented as probability distributions)
presents many challenges due to the large amount of data involved and the
difficulties comparing uncertainty between distributions. The Earth Mover's
Distance (EMD) has increasingly been employed to compare uncertain data due to
its ability to effectively capture the differences between two distributions.
Computing the EMD entails finding a solution to the transportation problem,
which is computationally intensive. In this paper, we propose a new lower bound
to the EMD and an index structure to significantly improve the performance of
EMD-based K-nearest neighbor (K-NN) queries on uncertain databases. The lower
bound approximates the EMD on a projection vector: each distribution is
projected onto a vector and approximated by a normal distribution with an
accompanying error term. We then represent each normal as a point in a
Hough-transformed space and use the concept of stochastic dominance to
implement an efficient index structure in that space. We show that our method
significantly decreases K-NN query
time on uncertain databases. The index structure also scales well with database
cardinality. It is well suited for heterogeneous data sets, helping to keep
EMD-based queries tractable as uncertain data sets become larger and more complex.
Comment: VLDB201
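
The idea of bounding the EMD through a projection can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it assumes a Euclidean ground distance and discrete distributions given as lists of `(point, weight)` pairs with weights summing to 1, and it computes the exact 1-D EMD of the projections rather than the normal approximation with an error term that the abstract describes. Because projecting onto a unit vector never increases Euclidean distances, the 1-D EMD of the projections cannot exceed the EMD of the original distributions. The function names are hypothetical.

```python
import math


def project(dist, direction):
    """Project each weighted point onto the unit vector along `direction`."""
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    return [(sum(p * u for p, u in zip(point, unit)), w) for point, w in dist]


def emd_1d(a, b):
    """Exact 1-D EMD: integrate |CDF_a - CDF_b| along the line."""
    events = sorted([(x, w) for x, w in a] + [(x, -w) for x, w in b])
    total, cdf_gap, prev_x = 0.0, 0.0, events[0][0]
    for x, signed_w in events:
        total += abs(cdf_gap) * (x - prev_x)  # area since the previous point
        cdf_gap += signed_w
        prev_x = x
    return total


def projected_lower_bound(p, q, direction):
    """Lower bound on the EMD of p and q (Euclidean ground distance)."""
    return emd_1d(project(p, direction), project(q, direction))
```

For two unit point masses at (0, 0) and (3, 4), whose EMD is 5, projecting onto the x-axis gives a bound of 3, while projecting onto the direction (3, 4) gives a tight bound of 5. The choice of projection vector therefore controls how tight the bound is.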
A Latent Parameter Node-Centric Model for Spatial Networks
Spatial networks, in which nodes and edges are embedded in space, play a
vital role in the study of complex systems. For example, many social networks
attach geo-location information to each user, allowing the study of not only
topological interactions between users, but spatial interactions as well. The
defining property of spatial networks is that edge distances are associated
with a cost, which may subtly influence the topology of the network. However,
the cost function over distance is rarely known, so developing a model of
connections in spatial networks is a difficult task.
In this paper, we introduce a novel model for capturing the interaction
between spatial effects and network structure. Our approach represents a unique
combination of ideas from latent variable statistical models and spatial
network modeling. In contrast to previous work, we view the ability to form
long- or short-distance connections as dependent on the individual nodes
involved. For example, a node's specific surroundings (e.g. network structure
and node density) may make it more likely to form a long-distance link than
other nodes with the same degree. To capture this information, we attach a
latent variable to each node that represents the node's spatial reach. These
variables are inferred from the network structure using a Markov Chain Monte
Carlo algorithm.
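
As a rough illustration of inferring a per-node latent variable with MCMC, the sketch below runs a Metropolis sampler over one reach value per node. Everything model-specific here is an assumption made for the example and is not taken from the paper: the link probability `exp(-d / (r_i + r_j))`, the exponential prior on each reach, the Gaussian random-walk proposal, and the names `log_lik_node` and `sample_reach`.

```python
import math
import random


def log_lik_node(i, reach, coords, edges):
    """Log-likelihood of all node pairs involving node i, under the ASSUMED
    link probability p_ij = exp(-d_ij / (r_i + r_j))."""
    total = 0.0
    for j in range(len(coords)):
        if j == i:
            continue
        d = math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
        p = math.exp(-d / (reach[i] + reach[j]))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep both logs finite
        linked = (min(i, j), max(i, j)) in edges
        total += math.log(p) if linked else math.log(1.0 - p)
    return total


def sample_reach(coords, edges, sweeps=200, step=0.3, prior_mean=1.0, seed=0):
    """Metropolis sampler; returns the average sampled reach of each node.
    No burn-in or convergence check is done in this sketch."""
    rng = random.Random(seed)
    n = len(coords)
    reach = [1.0] * n
    sums = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            old = reach[i]
            proposal = old + rng.gauss(0.0, step)
            if proposal > 0:  # proposals outside the support are rejected
                old_score = log_lik_node(i, reach, coords, edges) - old / prior_mean
                reach[i] = proposal
                new_score = log_lik_node(i, reach, coords, edges) - proposal / prior_mean
                accept_prob = math.exp(min(0.0, new_score - old_score))
                if rng.random() >= accept_prob:
                    reach[i] = old  # reject and restore the previous value
        for i in range(n):
            sums[i] += reach[i]
    return [s / sweeps for s in sums]
```

Here `coords` is a list of 2-D points and `edges` is a set of index pairs `(i, j)` with `i < j`. Updating one node at a time keeps each step cheap, since only the pairs involving that node change.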
We experimentally evaluate our proposed model on four different types of
real-world spatial networks (transportation, biological, infrastructure,
and social). We apply our model to the task of link prediction and achieve up
to a 35% improvement over previous approaches in terms of the area under the
ROC curve. Additionally, we show that our model is particularly helpful for
predicting links between nodes with low degrees. In these cases, we see much
larger improvements over previous models.
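
For readers unfamiliar with the evaluation metric, the area under the ROC curve for link prediction can be computed directly from its pairwise definition: the probability that a randomly chosen true link scores higher than a randomly chosen non-link, with ties counted as one half. The sketch below is a generic implementation of that definition and the function name `roc_auc` is hypothetical; it says nothing about how the paper's scores were produced.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (true link, non-link) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of links from non-links. The quadratic loop is fine for small test sets; a rank-based formulation would be preferable for large ones.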